Scaling and precursor motifs in earthquake networks 

A measure of the correlation between two earthquakes is used to link events to their aftershocks, generating a growing network structure. In this framework one can quantify whether an aftershock is close to or far from main shocks of all magnitudes. We find that simple network motifs involving links to far aftershocks appear frequently before the three biggest earthquakes of the last 16 years in Southern California. Hence, networks could be useful to detect symptoms typically preceding major events.

A fundamental open issue in the field of seismicity is whether earthquakes are to some extent predictable or not [?]. There are conflicting points of view about this [?]. Nevertheless, phenomenological approaches have been used for some decades to formulate algorithms for earthquake prediction [3, ?], sometimes based on the search for complex (long-range) correlations [3].

Insight into the issue of seismicity, and maybe of earthquake prediction, can be obtained by measuring the correlations between any pair of earthquakes. One method to estimate the amount of correlation was put forward in Ref. [?] (see also [8]), based on the statistical properties of earthquakes. If epicenters are distributed with a fractal dimension $d_f$, the mean number of events within an area of radius $l$ should scale as $l^{d_f}$. According to the Gutenberg-Richter law [1], the number of these events with magnitude larger than $m$ is proportional to $10^{-bm}$, with $b \approx 1$. Of course, the number of these events is on average also proportional to the time $t$ spent recording them. Hence, globally the mean number of events scales with the size of the space-time-magnitude window as $n \sim K\, t\, 10^{-bm}\, l^{d_f}$, where $K$ is a constant related to the seismic activity. When a new event $j$ takes place, it defines a point of view from which one can assess whether past seismic events appear unusual or usual with respect to their expected average number. Indeed, any pair of events $(i,j)$, separated by a time interval $t_{ij}$ and a distance $l_{ij}$, defines an expected number of events $n_{ij} = K\, t_{ij}\, 10^{-b m_i}\, l_{ij}^{d_f}$, where $m_i$ is the magnitude of the first event.
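
As a minimal sketch of the pairwise quantity defined above, the following Python function evaluates $n_{ij}$ for a single pair of events. The function name and the default $K = 1$ are assumptions made only for illustration; $b = 1$ follows the approximate value quoted above, and $d_f = 1.6$ is the value the text reports for the SC catalog.

```python
def expected_count(t_ij, l_ij, m_i, b=1.0, d_f=1.6, K=1.0):
    """Expected number of events n_ij = K * t_ij * 10**(-b*m_i) * l_ij**d_f.

    t_ij : time elapsed between event i and the later event j (seconds)
    l_ij : distance between the two events (meters)
    m_i  : magnitude of the earlier event i
    K    : activity constant (hypothetical default, not from the text)
    """
    return K * t_ij * 10.0 ** (-b * m_i) * l_ij ** d_f


# A pair is "more correlated" when n_ij is smaller: short delay,
# short distance, or a large magnitude of the first event.
near = expected_count(t_ij=3600.0, l_ij=1.0e3, m_i=6.0)
far = expected_count(t_ij=3600.0, l_ij=1.0e5, m_i=6.0)
weak = expected_count(t_ij=3600.0, l_ij=1.0e3, m_i=3.0)
```

With these inputs `near` is smaller than both `far` and `weak`, which is the ordering the text describes.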

One finds small $n_{ij}$ values when $j$ occurs immediately after $i$, very close to $i$, and if $i$ has a large magnitude. A very small $n_{ij}$ value means that an event with magnitude $m_i$ had a very small probability of occurring in the space-time window defined by event $j$. Since such a case should rarely take place at random, its actual occurrence tells us that $i$ and $j$ are correlated. Furthermore, the smaller $n_{ij}$ is, the more unusual event $i$ is "with respect to $j$" and the more $i$ and $j$ are correlated [9], as was argued in Ref. [?]. Hence, one can adopt $n_{ij}$ as a metric for quantifying correlations between events. On the basis of $n_{ij}$ one can also build a network of earthquakes [7] by drawing an oriented link to a new event $j$ only from the event $i$ giving the smallest $n_{ij}$ value (denoted as $n^*$). In this pair, we call event $i$ the "main shock" and $j$ the "aftershock", even if $m_j > m_i$ [10].

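The tree construction just described (one incoming link per new event, from the earlier event with the smallest $n_{ij}$) can be sketched as follows. This is an illustrative sketch, not the author's code: the event tuple layout `(t, x, y, m)`, the function names, planar coordinates, and $K = 1$ are all assumptions.

```python
import math


def best_parent(events, j, b=0.95, d_f=1.6):
    """Return (i, n_star): the earlier event i minimizing n_ij for event j.

    events is a time-ordered list of (t, x, y, m) tuples, with t in seconds,
    x and y in meters (hypothetical planar coordinates), m the magnitude.
    """
    t_j, x_j, y_j, _ = events[j]
    best_i, n_star = None, math.inf
    for i in range(j):
        t_i, x_i, y_i, m_i = events[i]
        l_ij = math.hypot(x_j - x_i, y_j - y_i)
        n_ij = (t_j - t_i) * 10.0 ** (-b * m_i) * l_ij ** d_f
        if n_ij < n_star:
            best_i, n_star = i, n_ij
    return best_i, n_star


def build_tree(events, b=0.95, d_f=1.6):
    """Directed links (main shock i, aftershock j), one per event j >= 1."""
    return [(best_parent(events, j, b, d_f)[0], j)
            for j in range(1, len(events))]


# A large early shock can win over a smaller, closer and more recent one.
catalog = [(0.0, 0.0, 0.0, 6.0), (10.0, 1000.0, 0.0, 3.0), (20.0, 1500.0, 0.0, 3.5)]
links = build_tree(catalog)
```

In this toy catalog both later events are attached to the magnitude 6 event, because its large magnitude outweighs its greater distance and elapsed time.
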
In this Letter we examine such earthquake correlation graphs by means of tools of network theory. We show that the notion of distance at the basis of the network construction underlies remarkable statistical scaling properties, which should reflect basic mechanisms of earthquake formation and propagation. We also find that some simple motifs (small subgraphs composed of a few nodes and links [11]) could constitute an interesting kind of precursor of major events. The study of motif occurrences is a strategy to understand the properties of the systems described by networks [11]. For example, it is currently believed that understanding the statistics of simple motifs in protein-protein interaction and transcription regulatory networks can help to understand the metabolism [11, 12].

The catalog we have analyzed is maintained by the Southern California (SC) Earthquake Data Center [13]. We consider data in the period from 1 January 1984 to 31 December 2003 and earthquakes with magnitude $m \geq m_< = 3.0$ (8858 events). In the area covered by the catalog the Gutenberg-Richter law holds with $b \approx 0.95$ [?], and $d_f = 1.6$ [?]. Quantities are always measured in MKS units.

We examine the three-dimensional distribution of earthquakes, taking into account their epicenters (latitude and longitude) and depths, i.e., their hypocenters. The spatial separation between events is given by the Euclidean distance between their hypocenters, and the fractal dimension of hypocenters is assumed to be $D_f = 1 + d_f = 2.6$. The metric we use is then $n_{ij} = K'\, l_{ij}^{D_f}\, 10^{-b m_i}\, t_{ij}$. Links reliably denoting correlations are those with $n_{ij}$ below a suitable threshold $n_c$ [?, 16]. In order to define a selection procedure independent of the constant $K'$, here we use $n_c = \langle n^* \rangle / 10$, where $\langle n^* \rangle$ denotes the average of all $n^*_i$ with $i = 2, 3, \ldots, j-1$.
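
To illustrate why the choice $n_c = \langle n^* \rangle / 10$ removes the dependence on $K'$, here is a small sketch. The helper names and the Cartesian `(x, y, z)` representation of hypocenters are assumptions for illustration; converting latitude, longitude and depth to meters is left out.

```python
import math


def hypocentral_distance(a, b):
    """Euclidean distance between two hypocenters given as (x, y, z) in meters."""
    return math.dist(a, b)


def threshold(n_stars):
    """n_c = <n*>/10, with <n*> the mean of the n* values of earlier events."""
    return sum(n_stars) / len(n_stars) / 10.0


# Rescaling every n value by the same constant K' rescales n_c by K' too,
# so the test n_ij < n_c gives the same answer for any K'.
n_stars = [1.0, 2.0, 3.0]
n_ij = 0.15
K_prime = 42.0
same_decision = (n_ij < threshold(n_stars)) == (
    K_prime * n_ij < threshold([K_prime * n for n in n_stars])
)
```

Here `same_decision` is true for any positive `K_prime`, since both sides of the inequality are multiplied by the same factor.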

If at most one incoming link per node is allowed, the network has the form of a growing tree [7]. We relax this constraint because we want a richer network structure, with an abundance of motifs like triangles of linked nodes, which are usually associated with the presence of nontrivial correlations within networks [17]. Nearly optimal incoming links to a new event have $n_{ij}$ slightly greater than their minimum value $n^*$ and are the first candidates to be added to the tree structure: hence, we choose to draw a link when $n_{ij} < n_c$ and $n_{ij} < \phi\, n^*$, with constant $\phi > 1$ (this procedure is also suggested by the fact that data from catalogs have experimental errors). We set $\phi = 10$, obtaining roughly 2 outgoing links per node, but other similar values do not considerably alter the results.

FIG. 1: Log-log plot of the global distribution $P(p)$ (circles), and of the distributions $P_{[m_1,m_2)}(p)$ generated by earthquakes with magnitude in the ranges $[m_1, m_2)$ (see legend). Two power-law regimes (with their respective exponents) are highlighted by dashed straight lines.
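
The link-drawing rule (a link from $i$ to the new event $j$ when $n_{ij} < n_c$ and $n_{ij} < \phi\, n^*$) can be sketched as below. The function name and the dictionary input are assumptions made for illustration; the inequalities are taken literally from the text, so a minimum value $n^* = 0$ would produce no link in this sketch.

```python
def select_links(n_values, n_c, phi=10.0):
    """Return the ids of earlier events linked to one new event j.

    n_values : dict {i: n_ij} for all earlier events i (hypothetical layout)
    n_c      : correlation threshold, e.g. <n*>/10
    phi      : tolerance factor around the minimum n* (phi = 10 in the text)
    """
    if not n_values:
        return []
    n_star = min(n_values.values())
    return sorted(i for i, n in n_values.items()
                  if n < n_c and n < phi * n_star)


# n* = 0.5, so links need n_ij < 5.0 and n_ij < n_c = 8.0.
parents = select_links({0: 1.0, 1: 5.0, 2: 20.0, 3: 0.5}, n_c=8.0)
```

In this example events 0 and 3 are kept, event 1 sits exactly on the $\phi\, n^*$ bound and is excluded, and event 2 fails both conditions.
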

Our analysis of the precursory phenomena is based on the statistics of the quantity

$$p_{ij} = l_{ij}^{D_f}\, 10^{-b m_i} \tag{1}$$

which is the space-magnitude part of the metric values $n_{ij}$ associated with drawn links. In Fig. 1 we show its distribution $P(p)$. In addition, we also plot the distributions of $p$ relative to links departing from shocks in ranges of magnitudes $[m_1, m_2)$, denoted as $P_{[m_1,m_2)}(p)$. Two distinct power laws appear in $P(p)$ as well as in all $P_{[m_1,m_2)}(p)$ considered. For $p \to 0$, $P(p) \sim p^{-\alpha}$, with $\alpha \approx 0.3$. In the regime $p \to \infty$ instead $P(p) \sim p^{-\beta}$, with $\beta \approx 1.55$. Since all $P_{[m_1,m_2)}(p)$ overlap quite well, and the aftershock distances vary weakly with time after an event (not shown), a length $l_m \propto 10^{(b/D_f) m}$ is a good unit for measuring the distance of aftershocks from an event of magnitude $m$. Thus, the exponent $\sigma = b/D_f \approx 0.37$ might justify the rescaling of aftershock distances with a factor $10^{\sigma m}$, as was done in Ref. [?] ($\sigma \simeq 0.4$ there).
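
The relation between the quantity in Eq. (1) and the magnitude-dependent length unit can be checked numerically. The sketch below uses the values $b = 0.95$ and $D_f = 2.6$ quoted in the text; the function names and the unit prefactor of the length scale are assumptions for illustration.

```python
import math


def length_unit(m, b=0.95, D_f=2.6):
    """Length unit 10**((b/D_f) * m) for a main shock of magnitude m.

    The prefactor is set to 1 here (an assumption); only the scaling matters.
    """
    return 10.0 ** ((b / D_f) * m)


def rescaled_distance(l_ij, m_i, b=0.95, D_f=2.6):
    """Aftershock distance measured in units of the main shock's length unit."""
    return l_ij / length_unit(m_i, b, D_f)


sigma = 0.95 / 2.6                                # b / D_f, about 0.365
p_direct = 1000.0 ** 2.6 * 10.0 ** (-0.95 * 5.0)  # l**D_f * 10**(-b*m)
p_rescaled = rescaled_distance(1000.0, 5.0) ** 2.6
```

Raising the rescaled distance to the power $D_f$ recovers $l_{ij}^{D_f}\, 10^{-b m_i}$, which is why a distribution that depends only on $p$ implies distances scaling as $10^{(b/D_f) m}$.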

The distribution $P(p)$ describes a property of individual correlations between pairs of earthquakes, from which we clearly see that two classes of aftershocks exist, corresponding to the two regimes of $P(p)$. A geophysical explanation of these two regimes could be related to the hierarchical fault structure: possibly, small $p$ values are connected to the conventional aftershocks within the rupture area, while the high-$p$ region could be determined mainly by inter-fault aftershocks, which are also detected by our method.

A wide area of aftershock activity, as quantified by a large $p$ value, may be favored by high stresses within the crust, and hence may be related to the periods prior to strong earthquakes. During these periods, it is also reasonable to find complex correlations in the stress field [18]. We have tested the possibility that these phenomena are highlighted by peculiar network motifs, i.e., by studying the local topological structure of the growing network of earthquakes. Indications supporting our hypothesis can be found by modifying the notion of local clustering coefficient of a node, which is normally given by the fraction of triangles it forms with its neighbors [17]. In order to meet our former requirements, the motifs we study here are special triangles (ST), in which the $p$ value carried by the first link ($i$-$k$ link in the Inset of Fig. 2) is larger than a given threshold $p_0$. The special clustering coefficient of a new node $j$ is then $C_j = \Delta_j / \Delta_j^{\max}$, where $\Delta_j$ is the number of ST it forms with its $k_j$ main shocks, and $\Delta_j^{\max} = k_j (k_j - 1)/2$. By definition $C_j = 0$ if $k_j < 2$.
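
A minimal sketch of the special clustering coefficient follows. It assumes that events are identified by chronological integer ids and that existing links are stored in a dictionary mapping `(older, newer)` pairs to their $p$ value; both conventions, and the function name, are illustrative choices rather than details given in the text.

```python
from itertools import combinations


def special_clustering(main_shocks, links, p0=1e7):
    """C_j for a new node j.

    main_shocks : ids of the events with a link pointing to j
    links       : dict {(i, k): p_ik} of existing links i -> k, with i < k
    p0          : threshold on the p value of the first (oldest) link
    """
    parents = sorted(set(main_shocks))
    k_j = len(parents)
    if k_j < 2:
        return 0.0
    special = 0
    for i, k in combinations(parents, 2):  # i < k, so i -> k is the first link
        p_ik = links.get((i, k))
        if p_ik is not None and p_ik > p0:
            special += 1
    return special / (k_j * (k_j - 1) / 2)


existing = {(0, 1): 5e7, (0, 2): 1e3, (1, 2): 2e7}
c_j = special_clustering([0, 1, 2], existing)
```

Here two of the three possible triangles carry a first link with $p > p_0$, so `c_j` is 2/3.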

To show that ST may be precursors of strong events we proceed as follows. The first three years of the catalog are used to obtain an initial estimate of $\langle n^* \rangle$. During the next year we just add links, to avoid possible problems arising from the analysis of a network where links to old events are lacking. Then, from the beginning of 1988, an algorithm analyzes the signal given by the $C$ value, evaluated for each event when it takes place. When $C > 0$, we start an integration of the $C$ signal, called $C_I$, which is reset to zero if $C = 0$ for a period $T_0$. Values $T_0 = 60$ days and $p_0 = 10^7$ yield a reasonable overall rate of $C > 0$ values (spikes $0 < C \leq 1$ in Fig. 2), avoiding the saturation of $C_I$, which is the signal that we think is somewhat proportional to the seismic hazard in the region. The periods when $C_I$ is greater than a constant threshold $C_h = 3$ are declared as alarms.
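
One possible reading of the alarm rule is sketched below. Two points are assumptions, since the text does not spell them out: the "integration" is taken to be a running sum of $C$ over events, and the reset is applied at the first event arriving more than $T_0$ after the last event with $C > 0$. The function name and input layout are also illustrative.

```python
def alarm_signal(times, c_values, T0=60.0, C_h=3.0):
    """Return a list of (C_I, alarm) pairs, one per event.

    times    : event times in days, in increasing order
    c_values : special clustering C of each event
    T0       : quiet period (days) after which C_I is reset to zero
    C_h      : alarm threshold on the integrated signal C_I
    """
    c_int = 0.0
    last_positive = None
    out = []
    for t, c in zip(times, c_values):
        if last_positive is not None and t - last_positive > T0:
            c_int = 0.0  # C stayed at zero for longer than T0
        if c > 0:
            c_int += c
            last_positive = t
        out.append((c_int, c_int > C_h))
    return out


signal = alarm_signal([0, 10, 20, 30, 100, 110], [1, 1, 1, 0.5, 0, 1])
```

In this toy series the alarm switches on at day 30, when $C_I$ reaches 3.5, and the signal is reset by day 100 after 70 quiet days.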

Figure 2 suggests that there is a relation between alarm times and the occurrence of the three biggest events in the catalog: for the Landers event [$m = 7.3$, labeled with (A)], the alarm would have started 9 weeks before its occurrence; for Northridge [(B), $m = 6.7$] one had to wait 6 weeks after the declaration of the alarm; while the alarm before Hector Mine [(C), $m = 7.1$] started 10 weeks in advance. Thus, they would have been predicted in the short term. The San Simeon event [(D), $m = 6.5$] instead was not within an alarm time, while an alarm was also declared in a period when the biggest event had $m = 5.7$.

The spatial location of the precursor motifs is another interesting issue. Fig. 3 and Fig. 4 show the distribution of ST giving rise to the alarms (i.e., when $C_I > 0$) before the three biggest events. In Fig. 3 small letters corresponding to those of the big events denote areas with ST, and three insets show enlargements of some of them. Excluding a cluster of ST which would have indicated the future location of the Landers epicenter [Fig. 3(a1)], ST do not appear close to the location of the incoming big events, in agreement with the idea that the preparation of an earthquake is not localized around its future source (see [3] and references therein).

FIG. 2: (Color online) Time series of event magnitudes (above, only $m > 4$ are shown) and of the special clustering $C$ of events (below). Landers (A), Northridge (B), Hector Mine (C), and San Simeon (D) are the four biggest events since 1988 in the catalog. The integrated signal $C_I$ is shown as a dashed line, while the horizontal dot-dashed line represents the threshold value $C_h = 3$: when $C_I > C_h$, alarms are declared (shaded areas, yellow online). Inset: sketch of a triangle of linked events, which is "special" if $p_{ik} > p_0$.

FIG. 3: (Color online) Location of big events (circles, same capital letters as discussed in the text and in Fig. 2), and of precursor patterns (ST), marked with the same letter (and color online) as the corresponding big shock. The three insets are enlargements of areas with ST. Color tones of the three links in a triangle follow the same order as in the Inset of Fig. 2; in particular the older link is darker.

A plausible explanation of both this delocalization of the precursor patterns with respect to the big shock and the relation between high $p$ values and strong earthquakes might come from the critical point scenario [18, 19], in which a big event represents a finite-time singularity [20]. Indeed, as in the theory of critical phenomena, a suitably defined correlation length shows a singular behavior, diverging prior to big earthquakes [?, 21]. This length is evaluated by a procedure which sums the distances between events which are not aftershocks. In view of our results, we believe that aftershock distances may be a complementary indicator of long-range correlations, and in particular that relatively far aftershocks could be a typical symptom of an incoming strong earthquake. Notice that we also obtain useful information from the statistics of the aftershocks of the numerous minor earthquakes, in agreement with the idea that the latter are active players in seismicity [22].

FIG. 4: Longitude and latitude of the last node of ST, during the period when $C_I > 0$ before the Landers, Northridge and Hector Mine events. The coordinates of the big events are plotted as dashed lines. There is a clear convergence of the ST to the Landers epicenter [see also Fig. 3(a1)].

To assess the stability of our simple algorithm, in Fig. 5 we have plotted an error diagram [23] where the fraction of events with $m \geq m_>$ that are not predicted is shown as a function of the fraction of alarm time. In the diagram, the performance of a random alarm declaration is represented by a line joining the point (0, 1) with (1, 0). Starting from the point ($n_c = \langle n^* \rangle/10$, $\phi = 10$, $p_0 = 10^7$, $C_h = 3$, $T_0 = 60$ days, $m_< = 3.0$, $m_> = 6.7$) in the parameter space, we have varied one parameter at a time around its initial value, and plotted the corresponding performance in Fig. 5. One clearly sees that the algorithm does better than a random alarm declaration, and that it is reasonably stable.

FIG. 5: (Color online) Error diagram. Symbols are associated with geographic zones and possibly with a modified parameter (see text). The line represents the performance of a random alarm declaration.
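
A point of the error diagram described above can be computed as in the following sketch, which returns the fraction of alarm time and the fraction of missed big events. The function name, the interval representation of alarms, and the toy numbers are assumptions for illustration and are not taken from the catalogs.

```python
def error_point(alarms, big_event_times, t_start, t_end):
    """Return (tau, nu) for one parameter choice.

    alarms          : non-overlapping (start, end) alarm intervals
    big_event_times : occurrence times of the events with m >= m_>
    tau             : fraction of the observation period covered by alarms
    nu              : fraction of big events falling outside every alarm
    """
    alarm_time = sum(min(e, t_end) - max(s, t_start)
                     for s, e in alarms if e > t_start and s < t_end)
    tau = alarm_time / (t_end - t_start)
    missed = sum(1 for t in big_event_times
                 if not any(s <= t <= e for s, e in alarms))
    nu = missed / len(big_event_times)
    return tau, nu


tau, nu = error_point([(10.0, 20.0), (50.0, 60.0)], [15.0, 70.0], 0.0, 100.0)
# A random alarm declaration lies on nu = 1 - tau; points below it do better.
gain = (1.0 - tau) - nu
```

For these toy inputs the alarms cover 20% of the time and miss one of the two events, so the point (0.2, 0.5) lies below the random line.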

The case illustrated in this paper shows that a translation of issues of seismicity into a network problem can be a fruitful approach. In order to gain further insight into this possibility, we have analyzed two other catalogs, centered around Northern California (NC) and Italy [24], and covering the same time span as our SC catalog. We have used the same parameters as for SC, but for NC we set $m_> = 6.5$ to include both the San Simeon and Loma Prieta (1989, $m = 7$) events in the big shock list. The algorithm does not recognize either of the two NC big events (no alarms declared, see Fig. 5). In Italy we set $p_0 = 10^8$, and a shift of the magnitudes ($m_< = 2.5$, $m_> = 5.8$) is necessary in order to include the two largest events (Umbria 1997, $m = 6$, and Molise 2002, $m = 5.9$) in the big shock list and a considerable number of smaller ones in the analysis. In this case, 4/6 of the big events are predicted, including the two most disruptive ones, with a fraction of alarm time $\approx 0.13$, as shown in Fig. 5.

In summary, by means of an appropriate metric quantifying the amount of correlation between earthquakes, aftershocks of any event can be identified. Aftershock distances from a shock of magnitude $m$ are properly measured by a length unit scaling as $10^{0.37 m}$. This information has been combined with a study of the local topology of the growing network of earthquakes, to show that simple motifs embodying links to unusually far aftershocks appeared frequently before the Landers, Northridge and Hector Mine events in Southern California.